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ABSTRACT 

Regulatory T-cells (Treg) play an essential role in the 
negative regulation of immune answers by develop- 
ing an attenuated cytokine response that allows 
suppressing proliferation and effector function of 
T-cells (CD4 + Th). The transcription factor FoxP3 is 
responsible for the regulation of many genes invol- 
ved in the Treg gene signature. Its ablation leads to 
severe immune deficiencies in human and mice. 
Recent developments in sequencing technologies 
have revolutionized the possibilities to gain insights 
into transcription factor binding by ChiP-seq and 
into transcriptome analysis by mRNA-seq. We 
combine FoxP3 ChiP-seq and mRNA-seq in order 
to understand the transcriptional differences 
between primary human CD4 + T helper and regula- 
tory T-cells, as well as to study the role of FoxP3 in 
generating those differences. We show, that mRNA- 
seq allows analyzing the transcriptomal landscape 
of T-cells including the expression of specific splice 
variants at much greater depth than previous 
approaches, whereas 50% of transcriptional regula- 
tion events have not been described before by using 
diverse array technologies. We discovered splicing 
patterns like the expression of a kinase-dead iso- 
form of IRAKI upon T-cell activation. The immuno- 
proteasome is up-regulated in both Treg and CD4 + 
Th cells upon activation, whereas the 'standard' 
proteasome is up-regulated in Tregs only upon 
activation. 



INTRODUCTION 

The recent years have shown that regulatory T-cells (Treg) 
play an essential role in the negative regulation of immune 
answers and the prevention of autoimmunity (1) by main- 
taining tolerance to self and controlling autoimmune de- 
viation. Treg cells have an attenuated cytokine response 
and can suppress proliferation and effector function of 
other T-cells. One important regulator to develop the 
Treg specific gene expression signature is the forkhead 
box transcription factor FoxP3, which is highly expressed 
in Treg cells, although it is also present at lower levels in 
effector T-cell populations. Deficiency in FoxP3 has been 
shown to underlie the lympho-proliferation and 
multi-organ autoimmunity of mutant mice and is linked 
with immunodysregulation polyendocrinopathy and the 
X-linked syndrome (IPEX) in humans (2). Although 
being a key regulator in Treg development, it becomes 
also increasingly evident that FoxP3 alone only accounts 
for part of the Treg signature and, for example, the sup- 
pression of IL2 and activation of IL2RA in T-cells are also 
found in FoxP3-deficient Scurfy mice (3). 

Treg cells differ significantly in their gene expression 
profile from CD4 + T helper cells, both, in resting and 
activated states. A large number of gene expression 
profiling studies (4-7) allowed for the identification of a 
signature of genes which are typically up- or 
down-regulated in Tregs. Those genes are involved in a 
variety of biologic processes and functions including cell 
surface and membrane proteins (TLR4, IL2RA, IL2RB or 
CTLA4), kinases (Map3K8), phosphatases (DUSP4) or 
transcription factors (FoxP3, IKZF2 and IKZF4). 

The role of FoxP3 in regulating the expression of Treg 
specific genes has further been elucidated by three studies 
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which used chromatin immimoprecipitation (ChiP) in con- 
junction with microarrays to identify chromosomal loca- 
tions of FoxP3 binding in mouse (8,9) and in human (10). 
They described a set of ~1400 mouse genes and 5579 
human genes which are bound by FoxP3. In combination 
with gene expression profiling, they identified genes which 
are subsequently de-regulated in their expression levels. 
Those studies showed that FoxP3 binding can explain ac- 
tivation or repression of a subset of genes involved in the 
Treg signature but also suggested that FoxP3 alone is not 
responsible for developing Treg cells. 

In addition, major advances in sequencing technologies 
also known as next or second generation sequencing 
(NGS) (11) allow for applications which go far beyond 
what has been possible only a few years ago. Those appli- 
cations include genome re-sequencing projects to identify 
genetic variation between parents and their children (12) 
or the analysis of folding principles of the human genome 
(13). To gain a deeper understanding of properties of 
a specific transcription factor, the combination of co- 
immunoprecipitation and subsequent DNA sequencing, 
also known as ChiP-seq (14,15), enables the unbiased 
identification of genomic regions bound by a transcription 
factor. NGS has also enabled us to perform gene expres- 
sion profiling (mRNA-seq) at an unknown depth, sensi- 
tivity and resolution including the identification of splice 
variants (16), expression profiles of single cells (17) or or- 
ganisms without prior genome sequence information (18). 

In this study, we use Illumina's Genome Analyzer 
platform to perform ChiP-seq of FoxP3-bound genomic 
regions and mRNA-seq for transcriptome profiling in 
samples of resting and activated primary CD4 + Th and 
Treg cells from human donors. Using this data set, we are 
able to show the influence of FoxP3 on gene expression 
patterns in human and also in comparison to published 
data sets in the mouse. Furthermore, our analysis enables 
very detailed insights into the transcriptomal landscape of 
resting and activated Treg and Teff cells including differ- 
ential expression of genes, splice variants and ncRNAs. 

MATERIALS AND METHODS 

Preparation of T-cell populations 

Leukapheresis products were obtained from adult healthy 
volunteers with approval by the ethical committee 
(Landesaerztekammer Baden-Wuertemberg). Human nat- 
urally occurring CD4 + CD25 + regulatory T-cells (Tregs) 
and untouched human CD4 + T helper cells (CD4 + Th) 
were isolated as described previously (19). Polyclonal 
T-cell activation was performed using soluble 1 iig/ml anti 
CD3 (clone OKT3) antibody (ebioscience) and 2 ug/ml anti 
CD28 (clone 28.2) antibody (ebioscience). 

FoxP3 ChiP 

Genpathway's FactorPath method was carried out as 
described by Labhart et al. (20). In brief, human T-cell 
populations, activated for 16 h, were fixed with 1% for- 
maldehyde for 15 min and quenched with 0.125 M glycine. 
Chromatin was isolated by adding lysis buffer, followed 
by disruption with a Dounce homogenizer (cells). Lysates 



were sonicated (Misonix) to shear the DNA to an average 
length of 300-500 bp. Genomic DNA (Input) was purified 
from an aliquot of chromatin and quantified on a 
Nanodrop spectrophotometer. Extrapolation to the 
original chromatin volume allowed quantitation of the 
total chromatin yield. 

ChiP assays of activated Th cells and nTregs were 
carried out in duplicate. An aliquot of chromatin (50 ug) 
was pre-cleared with protein G agarose beads 
(Invitrogen). FoxP3-bound genomic DNA regions were 
isolated using a goat polyclonal antibody against FoxP3 
(Abeam ab2481). After incubation at 4°C overnight, pro- 
tein G agarose beads were used to isolate the immune 
complexes. Complexes were washed, eluted from the 
beads with SDS buffer and subjected to RNase and pro- 
teinase K treatment. Crosslinks were reversed by incuba- 
tion overnight at 65° C and ChiP DNA was purified by 
phenol-chloroform extraction and ethanol precipitation. 
To assay for the enrichment of positive control region in 
the ChiP DNA, quantitative PCR (qPCR) reactions were 
carried out in triplicate with primers specific for these 
regions using SYBR Green Supermix (Bio-Rad). The re- 
sulting signals were normalized for primer efficiency by 
carrying out qPCR for each primer pair using Input 
DNA (data not shown). 

ChiP Sequencing (ChiP-seq) 

Remaining ChiP DNA (90% of entire sample) was ampli- 
fied using the Illumina ChiP-seq DNA Sample Prep Kit. 
In brief, DNA ends were polished and 5'-phosphorylated 
using T4 DNA polymerase, Klenow polymerase and T4 
polynucleotide kinase. After addition of 3'-A to the ends 
using Klenow fragments (3—5' exo~), Illumina genomic 
adapters were ligated and the sample was size-fractionated 
(-175-225 bp) on a 2% agarose gel. After a final PCR 
amplification step (18 cycles, Phusion polymerase), the re- 
sulting DNA libraries were quantified and tested by qPCR 
at the same specific genomic regions as the original ChiP 
DNA to assess quality of the amplification reactions. 
DNA libraries were sent to Illumina Sequencing Services 
for sequencing on a Genome Analyzer II resulting in ~14 
million quality-filtered sequences of length 35 per sample. 

mRNA Sequencing 

Total RNA from CD4 + Th and Treg cells was send to 
Fastens SA, Switzerland for sample preparation and 
Illumina GA sequencing. Poly-A was purified with oligo- 
dT magnetic beads (Dynabeads) and full length cDNA 
was prepared using Superscript III reverse transcriptase 
and RNAseH. The cDNA was fragmented by nebuliza- 
tion and then treated as described above for the ChiP-seq 
samples. After PCR amplification the library was quality 
controlled by cloning 1 ul into a TOPO plasmid and eight 
clones per library were Sanger sequenced. Libraries were 
quantified using Q-bit picogreen assay (Invitrogen) and 
BioAnalyzer (Agilent). Sequencing was performed on the 
Genome Analyzer II for 2 x 36 cycles (paired-ends) using 
sequencing kits version 2.0, 100 tiles per lane and data 
analysis pipeline GAPipeline-1.0rc4. 
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Bioinformatics analysis workflow 

Short read data from mRNA-seq and ChiP-seq experi- 
ments were mapped to the human genome (Ensembl 54) 
using the Genomatix Mapping Station (21) allowing up to 
4 mismatches and no gaps. Furthermore, mRNA-seq 
reads were mapped to a database of artificial splice junc- 
tions consisting of all pairs of exons in the human 
Ensembl 54 release. Mapping to splice junctions was per- 
formed using Bowtie (22) allowing for a maximum of two 
mismatches on either side of the junction and requiring at 
least 10 bases to match to one part of the junction. 

Expression levels for genes were obtained by summariz- 
ing all reads mapping to a gene in a sample and computing 
the RPKM value for a gene as proposed by Mortazavi 
et al. (23). RPKM values account for the different lengths 
of genes as well as for the different number of reads mea- 
sured in two experiments. Reads mapping multiple times 
in the genome were assigned to the respective genes pro- 
portional to the expression level of a gene as measured by 
unique reads. Fold changes were computed as the ratio 
of the RPKM values for measured for a gene in two 
samples and the significance of differential gene expres- 
sion was computed using the SAGEBetaBin method 
(24). Only genes with an absolute fold change >1.4 and 
a SAGEBetaBin significance score <0.01 were used for 
further steps. For the analysis of de-regulated genes 
measured by mRNA-seq, only genes with a minimal 
RPKM value of eight in at least one of the four conditions 
have been used. 

Significant clusters of FoxP3 bound sequences in ChiP- 
seq data were identified using MACS (14) and a window 
size of 300 bases, requiring for a P < 10e~ 6 . The minimal 
number of tags required for a cluster was determined from 
the input data. Significance values for the overlap between 
set of genes identified to be bound by FoxP3 in different 
species, i.e. mouse and human, were computed by the 
hypergeometric test. 

Results were interpreted in the context of biologic pro- 
cesses and functions, as well as networks and pathways 
through the use of Ingenuity Pathway Analysis (Ingenuity 
Systems Inc., Redwood City, CA, USA). 

RESULTS 

In the following, we will discuss the results of the expres- 
sion profiling (mRNA-seq) and the analysis of FoxP3 
binding-sites by chromatin-immunoprecipitation (ChiP- 
seq) experiments separately. The experimental setup is 
shown in Figure 1 . Finally, the integration of both data 
sets allowed us to describe a more comprehensive under- 
standing of the regulatory effects of FoxP3-binding in 
human activated Treg and CD4 + Th cells. 

Expression profiling of human Treg and CD4 + Th cells by 
mRNA-seq 

To analyze the transcriptomal landscape of primary 
human CD4 + Th and Treg cells, we sequenced the tran- 
scriptome of resting and activated Treg and CD4 + Th cells 
using Illumina's mRNA-seq method. Basic read mapping 
statistics are shown in Table 1 . Expression profiling using 



short read data was performed as described in the 
'Materials and Methods' section and an overview on the 
number of genes regulated in the different comparisons 
can be found in Table 2. Genes with a significant expres- 
sion level (RPKM > 8) in at least one population and a 
log2 ratio (LR) of larger than one were taken into 
account. The differences between two populations in a 
state-specific manner have been compared, i.e. resting 
Treg versus resting CD4 + Th cells as well as activated 



I y CD4 + Th cells Regulatory T cells 




Figure 1. Schematic overview of the experimental setup and the differ- 
ent human T cell populations used for RNA-seq and FoxP3 ChiP-seq 
analysis. 



Table 1. Read mapping statistics for mRNA-seq 



Sample 


All reads 


Mapped 


Exonic 


Splice 






reads 


reads 


junction 










reads 


Resting CD4 + Th 


15 277 248 


13 055 740 


9998 139 


551 483 


Activated CD4 + Th 


10335712 


8 397 379 


6473 155 


334714 


Resting Treg 


13 655 820 


11 567471 


8 817 608 


474 661 


Activated Treg 


8 237 652 


6 650525 


5407197 


281 711 



The table is showing the total number of single reads (paired end reads 
have been treated as two single reads), the number of reads mapping to 
the human genome, the number of exonic reads as well as the number 
of reads mapping to splice junctions. 



Table 2. Differentially expressed genes (DEGs) by comparing the 
RPKM values of separated T-cell populations 



Comparison 


LR > 1.0 


LR< -1.0 


LR > 1.0 


Treg act. /Treg rest. 


1056 


1059 


2115 


CD4 + Th act./CD4 + Th rest. 


924 


926 


1850 


Treg rest./CD4 + Th rest. 


171 


64 


235 


Treg act./CD4 + Th act. 


377 


408 


785 



Shown are the numbers of genes which are either up-regulated 
(LR>1.0) or down-regulated (LR<-1.0). The column LR>|1.0| 
shows the overall differentially expressed genes, LR = log2 ratio. 
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Treg versus activated CD4 + Th cells. Finally, the transi- 
tion from resting to the activated phenotype within a 
single cell type has been addressed. To qualify the confi- 
dence of the RNA-seq data, we proved the expression 
pattern of known T-cell markers, which have been 
described earlier as Treg-specific [e.g. FoxP3 (25,26), 
HPGD (27), CTLA4 (28), ZNFN1A2 (29,30)], as CD4 + 
Th cell specific [NKG7 (30), GZMB (30), CD40LG (31)] 
or as activation-specific [BIC (6), IRF4 (32), TNFRSF9 
(30)]: all of them confirmed the expected expression 
pattern (data not shown). In addition, we compared the 



expression pattern of all 2115 genes (detected by RNA-seq 
to be regulated by comparing activated Tregs versus rest- 
ing Treg cells) to the recently published mRNA expression 
profiling using Affymetrix Exon Arrays (E-TABM-779) 
(6). As the protocols for the purification of T-cell popula- 
tions, the stimulation and the RNA extraction has been 
exactly the same, the differences between these two studies 
allowed us to address the differences based on the applied 
technologies. As shown Figure 2A, only 406 (19%) 
genes have been identified to be de-regulated by both 
technologies (cut-off LR> |1.0|, e.g. FoxP3, Figure 2F). 
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Figure 2. Technology comparison of differentially expressed genes (Treg act. versus CD4 + Th cells). (A) Venn diagram for 2115 genes identified to be 
regulated by RNA-seq. Only 406 genes (19%) have been identified by the Affymetrix Exon Array technology, i.e. FoxP3 (F). (B) Analysis of reasons 
for not being detected by array technology. Of genes, 29% showed no regulation, i.e. PLEKHF1 (C); for 2% of genes no significant expression was 
detectable, i.e. HDC (D); only 42 genes showed a vice versa regulation pattern, i.e. RELB (E); 22% of genes the regulation could be confirmed by 
using a lower cuf-off for de-regulation (LR< |0.5|), i.e. PBX4 (G) and for 29% of genes no probeset was annotated at the Core data set using the 
Affymetrix Exon Arrays. The bar chart diagrams show in red for RNA-seq (in log2 RPKM values) and in green for Affymetrix Exon arrays (in log2 
values of RMA normalized expression) the relative expression pattern for all profiled T cell populations. RPKM = reads per kilobase of exon per 
million mapped sequence reads, LR = Log2 ratio. 
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We analyzed the remaining 1709 genes (81%) found to be 
regulated only by mRNA-seq and categorized the reasons 
for their absence (Figure 2B). 613 genes (29%) showed a 
significant expression level, but no regulation using the 
Affymetrix Exon Array technology (i.e. PLEKHF1, 
Figure 2C). 42 genes (2%) have not shown any significant 
expression levels using the Affymetrix Exon Arrays (i.e. 
HDC, Figure 2D). For only 8 genes a vice versa regulation 
has been observed (i.e. RELB, Figure 2E). 592 genes 
(28%) have not been measured on the microarray by 
probesets using the Affymetrix Exon Array core 
probeset annotation. By using the lower cuf-off of 
LR>|0.5|, additional 465 genes (22%) have been 
identified to be regulated also on the Exon Array 
platform (i.e. PBX4, Figure 2G). 

As recently described by Song et al. (33), also RNA-seq 
might miss transcripts at lower expression levels. Low ex- 
pression and potentially insufficient transcript sampling 
combined with high RPKM threshold result in differences 
of differentially expressed genes, when compared to micro- 
array experiments [645 genes are detected by arrays only 
(Figure 2A)]. The 'Materials and Methods' section states 
our very stringent threshold, that genes only with a 
RPKM > 8 in any of the four conditions have been incl- 
uded. To overcome the sensitivity issue in future experi- 
ments, a significant higher number of mapped reads will 
allow a decrease of the RPKM threshold. In summary, the 
gain of information achieved by applying RNA-seq 
weights out the loss of information by 2-fold in compari- 
son to the Affymetrix Exon arrays. 

Reasons for the discrepancies in results are mainly tech- 
nology associated. The differences between microarrays 
and RNA-seq protocols in IVT amplification versus 
PCR amplification, in analogous versus digital signal de- 
tection, in knowledge of sequences versus unbiased 
sequencing, in regard to cross-hybridization versus distin- 
guishing paralogous sequences, in spiked quantification 
versus absolute quantification of low-abundance tran- 
scripts have to be considered (17,34). In addition, the 
RNA-seq offers the identification of novel splice isoforms 
and the detection of SNPs. Still, for many applications, 
microarrays are the method of choice. Although the 
pricing for RNA-seq is reaching comparable levels, 
RNA-seq protocols still suffer from unknown biases 
such as those implied by the required ligation steps, and 
known biases where high abundance transcripts (house- 
keeping genes and/or ribosomal transcripts) make up the 
majority of sequencing data if not depleted (e.g. in some 
tissues 5% of the genes represent up to 75% of the reads 
sequenced). 

Next we compared the global gene expression pattern 
between activated Tregs and activated CD4 + Th cells. 
Therefore, we did not directly compare the gene expres- 
sion levels of activated Tregs versus activated CD4 Th 
cells. Instead of, we started to identify the genes which are 
regulated upon activation within the Treg populations 
(2115 genes) and CD4 + Th cell populations (1850 genes) 
independently. By comparing these two genesets, we were 
able to identify a geneset of 1036 genes, which are specif- 
ically regulated only in Treg cells (i.e. MS4A3, IL10, 
GARP (LRRC32), TNIP3 or IL-13; Figure 3A) and a 



geneset of 771 genes, which describe the genes specifically 
regulated in CD4 + Th cell activation (i.e. TNFSF13b, 
DDT, FOS, CXCL11 or NR4A2, Figure 3C). Finally, a 
common geneset of 1079 genes was defined, which 
contains genes regulated in Tregs as well as in CD4 + Th 
cells upon 16 h activation (i.e. SCD, IL22, FoxP3 or 
TSPAN2, Figure 3B). The three activation-specific 
genesets are shown as Venn diagram in Figure 3D and 
listed as Supplementary Table SI. 

The novel finding for 11-13 was proven by RT-PCR (not 
shown). Upon analyzing the RNA-seq data for additional 
CD4 + Th/CD4 memory cell marker gene expression 
(IFNg, IL4, CCR7, as well as CD62L), we are pretty con- 
fident that the IL-13 expression pattern did not originate 
from cell impurities, as it is expressed at that high levels in 
activated TREGs and did not show a major expression in 
activated Th cells. 

Subsequently, Ingenuity Pathway Analysis was used to 
analyze the activation-specific genesets to enrich influ- 
enced signaling pathways. We found 62 canonical signal- 
ing pathways scored with a significant — log \Q[P- value] > 2 
(Fisher's exact test) for at least one of the three genesets 
(Figure 3E). Upon activation, CD4 + Th cells significantly 
modulate IRF signaling, MAPK signaling, Tweak signal- 
ing, 11-6 signaling, NFkB activation, CD40 signaling, 
CD27 signaling and TNFR1 as well as TNFR2 signaling, 
whereas none of these canonical pathway were affected 
by activation of Tregs (Figure 3E). The other way 
around, Treg cells significantly modulate upon activation: 
Glucocorticoide receptor signaling, VEGF signaling, 
PI3K/AKT signaling, T-cell differentiation, PKCtheta sig- 
naling, NRF2 signaling, T-cell receptor signaling, ICOS 
signaling as well as the Protein Ubiquitination pathway 
(Figure 3E). 

Each modulated signaling pathway was overlaid by the 
expression data to visualize the biologic significance, i.e. 
the Protein Ubiquitination Pathway with the LR (log2 
expression changes) of Treg cells as well as with the LR 
of CD4 + Th cells as shown in Figure 3F or G. In CD4 + Th 
cells the proteasome is not affected by the activation 
thereof (Figure 3G). But specifically in activated Tregs a 
couple of the proteasome subunits are found to be strongly 
up-regulated (Figure 3F). The proteasome is a large 
multicatalytic complex which drives organized protein 
degradation. The proteasome plays a straightforward 
but critical role in the function of the adaptive immune 
system. Peptide antigens are displayed by the major histo- 
compatibility complex class I (MHC) proteins on the sur- 
face of antigen-presenting cells. These peptides are 
products of proteasomal degradation of proteins ori- 
ginated by the invading pathogen. Although constitutively 
expressed proteasomes can participate in this process, a 
specialized complex composed of proteins whose expres- 
sion is induced by interferon gamma produces peptides of 
the optimal size and composition for MHC binding. These 
proteins whose expression increases during the immune 
response include the 1 IS regulatory particle, whose main 
known biologic role is regulating the production of MHC 
ligands, and specialized (3 subunits called (3 1 i, P2i and (35i 
with altered substrate specificity. The complex formed 
with the specialized p subunits is known as the 
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Figure 3. Pathway Analysis of T cell activation. Shown are (A) genes specifically regulated upon activation of Treg cells. (B) Genes commonly 
regulated upon activation in both cell populations and (C) genes specifically regulated upon activation of CD4 + Th cells. (D) The Venn diagram 
shows the intersection of the activation signatures of CD4 + Th and Treg cells. (E) The three activation-specific genesets shown in Venn diagram have 
been used in Ingenuity Pathway Analysis to enrich influenced signaling pathways. Sixty two canonical signaling pathways scored with a significant 



(continued) 



7952 Nucleic Acids Research, 2011, Vol. 39, No. 18 



immunoproteasome (35). The immimoproteasome seems 
to be up-regulated in both Treg and CD4 + Th cells upon 
activation, whereas the 'standard' proteasome is only 
up-regulated in Tregs upon activation (Figure 3F or G). 
This Treg specific regulation might explain the observa- 
tion that long term culture of CD4 + Th cells in the 
presence of a small molecule inhibitor for LMP7 (also 
known as PSMB8, a catalytic subunit of the 
immunoproteasome) gave rise to a increased regulatory 
T-cell population (36). 

Figure 4 displays the some interesting marker genes dis- 
tinguishing Treg from CD4 + Th cells. The complete 
picture can be found in Supplementary Table SI. Those 
include general Treg marker genes like FoxP3, CTLA4, 
TLR4, IL2RA and IL2RB as well as Treg specific activa- 
tion markers [e.g. CABLES 1, GARP (LRRC32) and 
TNFRSF18 (GITR)] that are found to be specifically 
up-regulated in activated Tregs only. Some of those 
genes correspond to newly identified lineage markers 
(i.e. HDC, BTK and GATA2) which have not been 
described before. 

Finally, as shown in Figure 2F, FoxP3 is found to be 
expressed in all four samples. Low expression levels are 
found in CD4 + Th cells (resting expression level: 1.9 
RPKM, activated expression level: 4.6 RPKM). In 
contrast, expression in Treg cells is > 10-fold higher with 
an RPKM value of 27.4 in resting and 48.8 in activated 
Treg cells. Genes, whose promoters are bound by FoxP3 
and the implications of FoxP3 binding on gene expression 
are discussed in detail below. 

Identification of splice variant expression in 
mRNA-seq data 

Alternative splicing plays important roles in many 
biologic processes (37) and has also been described to be 
of importance in the immune system (38) and also during 
T-cell activation (39). In addition to the detection in gene 
expression levels, the mRNA-seq technique allows also the 
analysis for splice variants. Dependent of the complexity 
of gene-specific splicing behavior, present or absent tran- 
scripts are detectable for each sample. 

For IRAKI (Interleukin receptor 1 associated kinase), a 
serine/threonine kinase, three different splice variants with 
opposing function are known (38,40). In our data set we 
find two splice variants (full length IRAKI and IRAKlc) 
to be expressed. The full-length isoform of IRAKI is de- 
tectable only in resting cell states of Tregs and CD4 + Th 
cells. Upon activation of these cells, exon 1 1 seems to be 
skipped: surprisingly, only the truncated isoform IRAKlc 
is detectable in activated CD4 + Th cells (Figure 5A). The 
skipped exon encodes to the C-terminal part of the kinase 
domain. This smaller variant isoform therefore likely lacks 
kinase activity, although it retains the ability to interact 



with many IRAKI -binding partners. It may function as a 
negative-feedback loop in which IRAKI function is 
decreased in response to IL-1 stimulation. The presence 
and regulation of the truncated form of IRAKI was 
proven indirectly by qRT-PCR using Roche's UPL tech- 
nology (Supplementary Figure SI). 

SAM68 (shown in Figure 5B) is a protein which is re- 
cruited to the T-cell receptor and, once phosphorylated, 
functions as an adapter protein in signal transduction 
cascades by binding to SH2 and SH3 domain-containing 
proteins. In our data set, splicing events skipping the KH 
domain (PF00013), a domain known to be important for 
RNA binding, occurs only in resting CD4 + Th and Treg 
cells but is not observed in activated states. Variants 
lacking the KH domain have been shown to be expressed 
in growth-arrested cells only and are known to inhibit S 
phase (41). 

For AUF1 (Heterogeneous nuclear ribonucleoprotein 
DO), shown in Figure 5C, different rates for the inclusion 
and exclusion of a Glycin-rich region are found. This 
region was shown to be important for the RNA-binding 
function (42). While in resting CD4 + Th and Treg cells 
both isoforms (with and/or without exon7) seem to be 
expressed at equal levels, a clear shift towards the 
isoform skipping exon7 is detectable in activated Treg 
cells. Evidence is shown in Figure 5 by the expression 
level of exon 7 and the number of junction reads voting 
for either variant (data not shown). The presence, 
majority and the regulation of the truncated form of 
AUF1 was confirmed by qRT-PCR using Roche's UPL 
technology (Supplementary Figure S2). 

Two different splice variants have been described for 
FoxP3 in human (43). One isoform originates from an 
in-frame exon skipping event of exon 3 (protein coding 
exon 2) leading to a protein product which lacks 34 
amino acids. Neither the forkhead nor the zinc-finger 
domains are affected by this event. Both variants have 
also been shown to be functional (43) such that their 
biologic role still has to be elucidated. In our mRNA- 
seq data, both variants are detected as being present in 
resting Treg cells. In activated Treg cells the number of 
reads mapping the 5'-end of the gene is very small and, 
likely due to missing data, no indication for the skipping 
event is found. 

Other proteins for which differentially expressed splice 
variants are detected are IL2RA, LEF1 (both express 
mixtures of transcripts) and FYN. Specifically between 
resting and activated cell states for FYN different inclu- 
sion rates are found. Overall, this data displays the poten- 
tial of mRNA-seq data to enable the analysis of state and 
cell specific splice variant expression and highlights the 
fact that interesting and important functions during 
CD4 + Th and Treg activation may be mediated and per- 
formed through differentially expressed isoforms. 



Figure 3. Continued 

— loglOfP-value] > 2 for at least one of the three genesets. (F) The overlay of the epression data as LRs to the Protein Ubiquitination Pathway 
identified a couple members of the protease to be specifically up-regulated in activated Tregs, whereas no modulation of these has been observed after 
activation of CD4 + Th cells (G). The bar chart diagrams show the relative expression pattern for all in RNA-seq experiments profiled T cell 
populations in log2 RPKM values. RPKM = reads per kilobase of exon per million mapped sequence reads, LR = Log2 ratio. 
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Figure 4. Set of genes and their subcellular localization which are up-regulated in Treg cells compared to CD4 + Th cells. Genes shown in green are 
up-regulated in activated and resting Treg cells; orange genes are up-regulated upon activation of Treg cells but are not up-regulated upon CD4 + Th 
activation. The red stars mark genes which are bound by FoxP3 in activated Treg cells. 



Identification of Foxp3 binding sites in human T-cell 
populations and comparison to array-based 
measurements in mouse and human 

We performed ChiP-seq to identify genomic regions which 
are bound by FoxP3 in activated CD4 + Th and activated 
Treg cells of two human donors (biologic replicates). 
FoxP3-bound genomic DNA fragments from CD4 + Th 
and Treg cells were sequenced as described in the 
'Material and Methods' section. Short sequence reads 
were mapped to the human genome and significant 
peaks, i.e. clusters of reads, were identified using MACS 
(14) and the cluster statistics are displayed in Table 3. In 
both cell populations and biologic replicates a large 
number of clusters were found, with many more clusters 
in Treg cells compared to CD4 + Th cells, likely due to 
higher expression levels of FoxP3. Clusters are mapped 
to intergenic, intronic, exonic or known promoter 
regions and, as shown in Figure 6, promoter regions 
appear to be highly enriched. They are mapped 14 to 24 
times more often than expected by chance. As shown in 
Table 3, 32-51% of all clusters are mapped to known 
promoter sites (Ensembl 54). Genes detected in different 
biologic replicates overlap to a large extent. With respect 
to the smaller data set, 1264 (80.7%) of the CD4 + Th cell 
genes and 5627 (88.7%) of the Treg genes are identified in 



both biologic replicates indicating that the experiment in 
general returns highly reproducible results. 1170 genes 
(74% of the genes identified in sample Al) are detected 
in all four samples to be bound by FoxP3. Further, 1317 
genes are found in A2 as well as Bl and B2. Those 2487 
genes are likely to represent a set of genes which is bound 
by FoxP3 in both cell types. Only 26 genes are found in 
CD4 + Th cells only (samples Al and A2). In contrast, 
2897 genes are identified whose promoters are only 
bound by FoxP3 in Treg cells and which are candidates 
for a Treg specific regulatory effect of FoxP3. 

To further evaluate the quality of the FoxP3 clusters ob- 
tained in our experiment we tested to what extent ChiP- 
seq clusters are associated to DNAse I hypersensitive sites 
from human primary CD4 + T-cells which have been pub- 
lished by Boyle et al. (44). DNAse I hypersensitive sites 
have been historically used to identify the location of regu- 
latory regions, including enhancers, silencers, promoters 
and other locus control regions and can therefore provide 
additional evidence regarding data quality. The DNAse I 
sites were mapped onto the human genome and were sub- 
sequently clustered by MACS. As shown in Figure 6, 
DNAse I hypersensitive sites found in all four genomic 
categories but, just like FoxP3 clusters, they are highly 
enriched in promoter regions. The overlap between 
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Figure 5. Examples for the detection of splice variants in Treg and CD4 + Th cells. Exons of the genes are shown in blue, read coverage at a specific 
position is indicated by the orange bars. Examples for known transcript models of a gene are shown below the gene structures where solid lines 
indicate that a specific splice junction has been identified in the data while dotted lines indicate that a junction has not been found. Green bars 
indicate the location of specific protein domains. (A) displays changes in the expression of exon 1 1 between resting and activated CD4 + Th cells 
corresponding to the partial deletion of the kinase domain (shown by the green bar) of IRAKI. (B) For the SAM68 gene, exon skipping of exon 3 
which represents a KH RNA binding domain (PF00013, indicated by the green bar) is only observed in resting Treg and CD4 + Th cells and never in 
activated states. (C) For AUF1 inclusion or exclusion of a Glycin-rich region (shown by the green bar) is found to occur in different levels in 
activated Treg and CD4 + Th cells compared to resting states. 




FoxP3 clusters and DNAse I clusters is even higher than 
the overlap of FoxP3 clusters with promoter regions. 
Between 55 and 68% of all FoxP3 clusters overlap with 
DNAse I clusters. Therefore, a larger fraction of FoxP3 
binding sites which do not map to known promoter sites 



might be bound in other regions of regulatory relevance 
(i.e. cis- and/or trans-active regulatory elements like enhancer 
or silencers). 

Using the Genomatix collection of transcription factor 
binding matrices, we searched for other, enriched 
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Table 3. FoxP3 cluster statistics for CD4 + Th and Treg cell populations obtained by Illumina ChiP-seq in two biologic 
replicates 
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Figure 6. Cluster statistics for human FoxP3 clusters identified by 

MACS which shows the high enrichment of promoter elements 

overlapping with MACS clusters. Biologic replicates of isolated 

human CD4 + T-cells (Al and A2) and human regulatory T-cells 
(Bl and B2). 



transcription factor binding motifs in promoter regions 
bound by FoxP3. In both biologic replicates (Bl and B2) 
several transcription factor binding matrices are identified 
in FoxP3 bound regions more often than expected. 
Among the top scoring matrix families, four appear to 
be especially interesting in the biologic context. This 
includes the ETS-1 as well as the HAML (human acute 
myelogenous leukemia) factor family. Cooperative 
binding of members of the two families has been described 
in several cases in the context of T-cells including the co- 
operative binding of AML1 and Ets-1 to the TCR alpha 
enhancer (45) as well as composite binding of Ets-1 and 
CBF (another member of the HAML family) in the regu- 
lation of antigen receptor genes (46). Futhermore, binding 
sites for AP1F are overrepresented in FoxP3 bound 
regions. Members of this family are known to interact 
with both HAML and ETSF in various cases including 
the activation of GM-CSF in T-cells (47) and FoxP3 has 
been shown to suppress AP-1 transcriptional activity (48). 
Finally, interferon regulatory factor (IRFF) binding sites 
are overrepresented. A member of this family, IRF4, has 
been described to interact with multiple Ets factors to 
modulate CD68 expression in a cell type-specific manner 
(49). Also, it has only recently been shown (50) that in 
mouse Treg cells, high amounts of interferon regulatory 
factor-4 (IRF4) is dependent on Foxp3 expression. The 
authors proposed that IRF4 expression endows Treg 
cells with the ability to suppress TH2 responses a fact 
which might be confirmed by the over-representation of 
IRFF binding sites in FoxP3 bound genes such that col- 
laborative binding might occur. The NFAT transcription 



factor motif which has been shown previously to interact 
with FoxP3 (51) is also overrepresented although less sig- 
nificantly than the four matrices described above. 

Next, we tested to what extent genes bound by FoxP3 in 
our human data set have also been identified in two 
studies by Zheng et al. (8) and Marson et al. (9) that ana- 
lyzed FoxP3 occupancy in mouse and one recent study by 
Sadlon et al. (10) who identified human FoxP3 targets 
using ChiP-on-Chip experiments. In the Zheng data set, 
1277 FoxP3 binding sites are annotated which correspond 
to 519 mouse genes in Ensembl 54. Out of those 454 can 
be mapped to a corresponding human ortholog and are 
used in subsequent analysis steps. In the Marson data set, 
1335 binding sites in unstimulated and 1119 binding sites 
in stimulated hybridoma cells were identified (at a false 
discovery rate of 5%) which map to 1265 and 1055 
distinct mouse genes. Out of those 1162 genes in unstimu- 
lated and 960 genes in stimulated cells have an annotated 
ortholog in the human genome. The 5579 genes identified 
by Sadlon et al. (10) in human correspond to 4185 human 
genes in Ensembl 54. The overlap between the genes in the 
three published data sets and the 5627 human FoxP3 
bound genes in Treg cells is shown in Figure 7. It is inter- 
esting to note that the overlap of the two published mouse 
data sets is relatively small. Only 107 genes are detected in 
both data sets. Although this overlap is highly significant 
(P < 1.66e~ 42 ) it shows that even within the same species 
larger differences in genes detected to be bound by FoxP3 
can be expected either due to the experimental design like 
cell lines or stimulation or due to genetic and biologic 
variance. Human FoxP3 bound genes in our data set sig- 
nificantly overlap with mouse genes. The human data set 
contains 181 genes detected in the Zheng study 
(P<1.91e -14 ) and 561 genes from the Marson data set 
(P < 3.453e~ 77 ). 58 genes are found in all three studies and 
743 genes are found in at least two data sets (49 of which 
are only found in the mouse). The largest and most sig- 
nificant overlap is found for the two human data sets. 
1586 genes are consistently identified in both data sets as 
FoxP3 target genes (P< 1.65c- 1 12 ) and are therefore con- 
sistent between two different experimental setups and two 
different technologies. 

The effects of FoxP3 binding onto gene expression in 
human T-cells 

As described above, 2897 genes are specifically bound by 
FoxP3 in activated Treg cells, 26 only in CD4 + Th cells 
and 2487 genes are bound in both cell types. We integrated 
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this information with expression data from the mRNA- 
seq experiment of activated CD4 + Th and activated Treg 
cells, as discussed above. 3255 genes have been identified 
to be significantly regulated (absolute fold-change >1.4 
and significance value <0.01) between activated CD4 + 
Th and Treg cells by mRNA-seq analysis. 1451 (44%) of 
those genes are also bound by FoxP3 either in Treg only 
(800 genes, 399 up-regulated, 401 down-regulated) or in 
Treg and CD4 + Th cells (647 genes, 362 upregulated, 285 
downregulated). The fact that 44% of all regulated genes 
between activated Treg and CD4 + Th cells are bound by 
FoxP3 proofs its important role for activated Treg cells. 
The overlap is statistically highly significant 
(P<4.55e~ 19 ). Marson et al. (9) described a highly sup- 
pressive function of FoxP3 in their data set where many 
more genes were down-regulated than up-regulated upon 
FoxP3 binding, a view which has also been taken by 
Ziegler et al. (2). In contrast to those findings and in cor- 
respondence with the Zheng study (8), our human data set 
suggests that activation and repression of genes by FoxP3 
seem to be well balanced. 

The gene which shows the highest up-regulation 
(17-fold) in Treg cells is CTLA4 which is bound by 
FoxP3 according to ChiP-seq data in two clusters as 
shown in Figure 8A. Other examples are IL1R1 (up- 
regulation: 13-fold), DUSP2, DUSP4 and DUSP6 (up to 
8-fold) as well as CXCR6 and CCR4 (up to 4-fold). 
Interestingly, FoxP3 binds to its own promoter in Treg 
cells and may therefore regulate its own expression levels. 



On the other hand, 401 genes bound by FoxP3 are sig- 
nificantly down-regulated in Treg cells. In our data set 
those genes include DPEP2 (60-fold down-regulated), 
ITGA4 (6-fold), TLR1 (7-fold), IFIT2 (8-fold), TRAT1 
(3-fold) which is a positive regulator of the T-cell 
receptor signaling pathway as well as members of the 
TRIM (tripartite motif-containing) family (TRIM21, 
TRIM22, TRIM38, TRIM56). TRIM21 is known to 
increase IL2 production in T-cells (52). 

2487 genes are bound by FoxP3 in activated CD4 + Th 
and Treg cells. Out of those, 647 (26%) genes appear to be 
significantly de-regulated in the comparison between the 
two cell types. Although there is no general correlation 
between binding strength (as measured by the number of 
reads found in a promoter cluster) and the resulting fold 
change in the gene expression measurement (see 
Supplementary Table S2), for 503 of those 647 genes 
(78%) binding strength is at least 2-fold higher in Treg 
cells compared to CD4 + Th cells. 362 genes are 
up-regulated in Treg cells while 285 genes are 
down-regulated. For example, IL16 and LY9 (lymphocyte 
antigen 9), which are both shown in Figure 8B and C, as 
well as IL7R appear to be down-regulated, while e.g. 
RGS1, IL2RB, FOS and FOSL2 (both part of the AP-1 
transcription complex) are up-regulated. 

Interestingly, the up-regulation of proteasome compo- 
nents in Treg cells discussed above clearly seems to be 
mediated by FoxP3 as many proteasome genes are 
bound by FoxP3 in their promoter regions (see also 
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Figure 8. Examples for the modulation of transcript expression by FoxP3. Exons of the genes are shown in blue, read coverage at a specific position 
is indicated by the red (activated Treg cells) and blue (activated CD4 + Th cells) lines. Examples for known transcript models of a gene are shown 
below the gene structure. Green arrows indicate positions of significant FoxP3 binding site peaks identified in activated Treg cells of both donors. 
(A) shows the up-regulation of CTLA4 likely mediated by two FoxP3 clusters, (B) LY9 is bound by FoxP3 at multiple sites in both cell lines but 
binding strength is lOx higher in Treg than in CD4 + TH cells, leading to the almost complete repression of the gene in Treg cells, (C) Short isoforms 
of the IL16 gene are known to be specifically expressed in T-cells. Expression and repression of the short variant of the gene may be mediated by 
FoxP3 which is bound to multiple sites of the gene including the promoter site of the short isoform in both cell types. 



Figure 4). Also, as shown in Figure 8C for the example 
of IL16, FoxP3 may also be involved in the use of 
specific alternative transcription start sites in order to 
mediate expression of the T-cell specific isoform of the 
gene. 



DISCUSSION 

Treg cells play an essential role in the negative regulation 
of immune answers and FoxP3 has been shown to be one 
of the major regulator genes for developing the Treg gene 
signature. But despite many research in the field the mech- 
anisms leading to the specific gene expression patterns 
are not completely understood. The last years have also 
shown that transcription factor binding may differ strongly 
from species to species such that results achieved in one 
model organism may not necessarily be transferred to 
others. 



In this study we have analyzed for the effects of FoxP3 
transcription factor binding on the development of the 
Treg gene signature in humans using the most recent 
and also most sensitive technological platforms available. 
We combined two approaches relying on next generation 
sequencing, namely mRNA-seq and ChiP-seq, which, in 
combination, allow for very detailed insights into the 
complex transcriptomal landscape of higher eukaryotes. 

We have performed expression profiling of resting and 
activated Treg and CD4 + Th cells and identified gene sig- 
natures that are specifically de-regulated in the different 
cell types (Treg and CD4 + Th signature) or are regulated 
in both cell types upon activation (Activation signature). 
Signatures were compared to sets of genes identified in a 
similar study using Affymetrix Exon Arrays. In corres- 
pondence with other authors we find that while the key 
players are identified by both technologies, NGS can iden- 
tify many more de-regulated genes likely to its higher sen- 
sitivity and the fact that any gene can be profilied without 
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having to rely on pre-defined sets of probesets. Analysis of 
de-regulated pathways shows significant differences be- 
tween Treg and CD4 + Th cells not only on the level of 
single genes. 

Furthermore, using NGS data we can identify differ- 
ences in splicing patterns like the expression of a 
kinase-dead isoform of the IRAKI gene upon T-cell acti- 
vation. Although it is not yet clear what function this 
kinase-dead isoform has in vivo, it is interesting to note 
that expression of the small isoform has been shown to be 
specific for activated Treg and CD4 + Th cells whereas 
both cell types express the full length isoform in resting 
states. Overexpression of IRAKUc in 293T or G292 cells 
suppressed NF-kB activation and blocked IL- lb-induced 
IL-6 as well as LPS and CpG-induced TNFa production. 
Mechanistically evidence was shown that IRAKlc func- 
tions as a dominant negative by failing to be phosphory- 
lated by IRAK4, thus remaining associated with Tollip 
and blocking NF-kB activation (40). The presence of a 
regulated, alternative splice variant IRAKlc in activated 
T cells, that functions as a kinase-dead, dominant-negative 
protein adds further complexity to the variety of mechan- 
isms that regulate TIR signaling and the subsequent in- 
flammatory response. 

By using ChiP-seq we identified genes bound by FoxP3 
in activated Treg and CD4 + Th cells. By identifying 
promoter regions with significant FoxP3 peaks in short 
read data we obtained a set of more than 5000 genes 
which may be subject to a FoxP3 mediated regulation of 
the corresponding gene. Genes identified to be bound in 
human show a highly significant overlap with two pub- 
lished FoxP3 data sets in the mouse, although many more 
genes were identified in human. This is likely due to the 
higher sensitivity of ChiP-seq compared to array-based 
detection methods which again require a priori know- 
ledge. Promoter regions bound by FoxP3 additionally 
show enrichment for other transcription factor binding 
sites like ETS-1, HAML, API and IRFF (including 
IRF4) which are novel candidates for a cooperative 
function in controling T-cell development in human. 
Especially the correlation of FoxP3 binding sites with 
IRF4 appears to be interesting, as IRF4 has only recently 
been shown to be transcriptionally dependent on FoxP3. 

In a last step, we combined both data sets in order to 
understand the regulatory effects of FoxP3 on the specific 
gene expression signature of Treg cells. More than 40% of 
the genes which appear to be de-regulated in activated 
Treg cells compared to activated CD4 + Th cells are 
bound by FoxP3 in their promoter regions showing the 
large importance of this transcription factor for develop- 
ing the Treg signature. In contrast to some previous 
studies which proposed the major function of FoxP3 as 
a suppressor, we find that activation and repression of 
genes by FoxP3 are likely to be well balanced in human. 
Nobody before profiled that comprehensively human 
primary T cell populations. The functional relevance of 
the identified FoxP3-bound genes has been confirmed by 
earlier studies. In 2009, we published the functional rele- 
vance of FoxP3-bound miR-155 in sensititation CD4 + Th 
Cells for TREG mediated suppression (53). In addition, 
we (in human) and others (in mouse) could identify 



PDE3b as a FoxP3 target gene (54). PDE3b is involved 
in controlling the abundance of cyclic adenosine mono- 
phosphate (cAMP) and cAMP is known to suppress 
T cell function. Because Foxp3 blocks PDE3B expression, 
regulatory T cells may contain higher amounts of cAMP, 
which may play an important role in regulatory T cell 
function (54,55). Further examples for TREG-mediated 
T cell suppression of FoxP3 target genes have been 
described: DAB2 (56), CTLA4 (57) and EBB (IL-27/ 
IL-35) (58). Beyond these selected examples, we did inte- 
grate the human FoxP3 occupancy and expression data 
with expression profiles derived from T cell populations 
of scurfy mice (FoxP3-deficient) (59,60). We downloaded 
the raw data (E-GEOD- 11775), normalized, orthologous 
aligned (mouse to human) and integrated them with 
FoxP3-bound genes. A significant overlap of FoxP3- 
bound genes (ChiP-seq), which have been identified to 
be regulated upon T cell activation in human (RNA- 
seq), revealed no major regulation in activated T cell 
populations from scurfy mice (data not shown). 
Although there are differences in FoxP3 function and ex- 
pression between human and mouse (61), by that kind of 
analysis we have been able to show indirect proof of func- 
tional relevance for the FoxP3-bound genes. 

Taken together our study provides a comprehensive 
data set which allows analyzing the FoxP3 dependent de- 
velopment of the Treg gene signature as well as the differ- 
ences between Treg cell and CD4 + Th cells in general. The 
use of next generation sequencing provides significantly 
deeper insights into the complex transcriptomal landscape 
of higher eukaryotes compared to array-based tech- 
nologies and will help to further sharpen our understand- 
ing of transcriptional regulation of gene expression by 
transcription factors in the future. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online. The 
raw sequencing data have been uploaded to http://www. 
ncbi.nlm.nih.gov/sra under the accession number 
SRP006674. 
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